Grounding annotations in published literature with an emphasis on the functional roles used in metabolic models

Authors

  • Erik Binter
  • Scott Binter
  • Terry Disz
  • Elizabeth Kalmanek
  • Alexander Powers
  • Gordon D. Pusch
  • Julie Turgeon
Abstract

Accurate genome annotations in databases are a critical resource available to the scientific community for analysis and research. Inaccurate and inconsistent annotations exist as a result of errors introduced by large-scale automated annotation, and they currently act as a barrier to the application of bioinformatics. The purpose of this effort was to improve the SEED by strengthening the connection between functional roles and literature references. Direct literature references (DLits), found through searches of PubMed and other online databases such as SwissProt, were attached to protein sequences within the PubSEED to provide literature support for the roughly 2,500 distinct functional roles used to construct metabolic models within the Model SEED. Only DLits in which a researcher asserted the function of a protein were attached to sequences. Starting from a list of 1,072 functional roles that did not previously have DLit support, we were able to connect sequences to literature for 655 functional roles, at least 484 of which were in the original list of unsupported roles. When added to the existing set of sequences having DLits, the resulting set of DLit-sequence pairs (the foundation set) now connects approximately 4,300 DLits to approximately 5,600 distinct protein sequences obtained from approximately 16,000 genes (some of these genes have identical protein sequences). From the foundation set, we constructed projection sets such that each set contains one member of the foundation set and projections of its functional role onto similar genes. The projection sets revealed 120 inconsistent annotations within the SEED. Two types of inconsistencies were corrected through manual annotation in the PubSEED: instances in which two identical protein sequences had been annotated with different functions, and instances in which projected functions contradicted previous annotations. A total of 26,785 changes to gene function assignments, 219 of which were to previously uncharacterized proteins, resulted in a more consistent and accurate set of input data from which to construct revised metabolic models within the Model SEED.
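The first type of inconsistency described in the abstract (identical protein sequences annotated with different functions) can be illustrated with a small sketch. This is not the authors' pipeline or any SEED API: the function name, the input tuple layout, and the example gene identifiers, sequences, and role names are all hypothetical, and a real check would likely need to normalize role strings more carefully than this does.

```python
from collections import defaultdict


def find_identical_sequence_conflicts(annotations):
    """Group genes by exact protein sequence and report sequences
    that carry more than one distinct functional role.

    `annotations` is an iterable of (gene_id, protein_sequence, role)
    tuples; this layout is an assumption made for the sketch.
    """
    roles_by_sequence = defaultdict(lambda: defaultdict(list))
    for gene_id, sequence, role in annotations:
        # Strip surrounding whitespace so trivially different strings match.
        roles_by_sequence[sequence][role.strip()].append(gene_id)

    # Keep only sequences annotated with two or more different roles.
    return {
        sequence: dict(roles)
        for sequence, roles in roles_by_sequence.items()
        if len(roles) > 1
    }


# Hypothetical data: gene_A and gene_B share a sequence but not a role.
example = [
    ("gene_A", "MKTAYIAKQR", "Hypothetical role X"),
    ("gene_B", "MKTAYIAKQR", "Hypothetical role Y"),
    ("gene_C", "MSLLTEVETY", "Hypothetical role Z"),
]

conflicts = find_identical_sequence_conflicts(example)
for sequence, roles in conflicts.items():
    print(sequence, roles)
```

Each reported sequence would then be a candidate for manual review; the sketch only flags disagreements and does not decide which role is correct.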


Similar resources

Personalized nutrition and its roles on some metabolic disorders: A narrative review

Introduction: Considering an individual’s characteristics such as genetics along with other characteristics and dietary habits can help to provide an effective diet for prevention and controlling metabolic disorders. Accordingly, in the present study, we aimed to review evidence on personalized nutrition (PN) and its roles in metabolic disorders. Materials and Methods: In the present narrative ...


U-Bending Analysis with an Emphasis on Influence of Hardening Models

In this paper the effect of different hardening models in simulating the U-bending process for AA5754-O and DP-Steel, taking a benchmark of NUMISHEET 93 2-D draw bending, has been discussed. The hardening models considered in simulations are: isotropic hardening, pure (linear) kinematic hardening and combined (nonlinear kinematic) hardening. The influence of hardening models on predicting sprin...


Climate Changes and Vector-Borne Diseases With an Emphasis on Parasitic Diseases: A Narrative Review

Background and Objectives: The issue of climate change has currently become a critical concern for the global community as it affects the transmission and spread of a wide range of diseases. This study aims to examine the literature and scientific evidence concerning the impact of climate change on vector-borne diseases. Methods: In this review research, a comprehensive search and review of te...


Genome-Scale Metabolic Network Models of Bacillus Species Suggest that Model Improvement is Necessary for Biotechnological Applications

Background: A genome-scale metabolic network model (GEM) is a mathematical representation of an organism’s metabolism. Today, GEMs are popular tools for computationally simulating biotechnological processes and for predicting biochemical properties of (engineered) strains. Objectives: In the present study, we have evaluated the predictive power of two ...


Impact of Ocean-Land Mixed Propagation Path on Equivalent Circuit of Grounding Rods

In this paper, the effect of ocean-land mixed propagation path on the lightning performance of grounding rods is investigated. This effect is focused on two problems. The first is extracting exact equivalent circuit of grounding rods in the presence of oceans. The equivalent circuit can be used in transient analysis of power systems in the neighboring oceans. In the second one, this effect on t...



Volume: 2

Publication date: 2012